Markov Decision Processes with Time-Varying Geometric Discounting
Authors
Abstract
Canonical models of Markov decision processes (MDPs) usually consider geometric discounting based on a constant discount factor. While this standard modeling approach has led to many elegant results, some recent studies indicate the necessity of modeling time-varying discounting in certain applications. This paper studies a model of infinite-horizon MDPs with time-varying discount factors. We take a game-theoretic perspective, whereby each time step is treated as an independent decision maker with their own (fixed) discount factor, and we study the subgame perfect equilibrium (SPE) of the resulting game as well as the related algorithmic problems. We present a constructive proof of the existence of an SPE and demonstrate the EXPTIME-hardness of computing an SPE. We also turn to the approximate notion of epsilon-SPE and show that an epsilon-SPE exists under milder assumptions. An algorithm is presented to compute an epsilon-SPE, for which an upper bound on the time complexity, as a function of the convergence property of the time-varying discount factor, is provided.
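As a rough illustration of the game-theoretic view in the abstract, the sketch below evaluates a fixed reward sequence from the perspective of the decision maker at step t, who applies their own fixed discount factor to all later rewards. The function name `agent_utility`, the finite truncation of the horizon, and the sample numbers are assumptions made for illustration; the abstract does not give the paper's exact utility definition, and the paper itself treats the infinite-horizon case.

```python
from typing import Sequence


def agent_utility(rewards: Sequence[float], gammas: Sequence[float], t: int) -> float:
    """Utility of the step-t decision maker over a truncated reward sequence.

    Assumption: the agent at step t discounts the reward received k steps
    later by gammas[t] ** k, i.e. it uses its own fixed discount factor.
    """
    gamma_t = gammas[t]
    return sum(gamma_t ** k * r for k, r in enumerate(rewards[t:]))


# Hypothetical data: constant rewards, with a more impatient first agent.
rewards = [1.0, 1.0, 1.0, 1.0]
gammas = [0.5, 0.9, 0.9, 0.9]

u0 = agent_utility(rewards, gammas, 0)  # 1 + 0.5 + 0.25 + 0.125
u1 = agent_utility(rewards, gammas, 1)  # 1 + 0.9 + 0.81
```

Because the agents at steps 0 and 1 weigh the same future rewards differently, a plan that is best for one need not be best for the other, which is why an equilibrium notion such as SPE is a natural object of study here.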
Similar resources
Markov decision processes with exponentially representable discounting
We generalize the geometric discount of finite discounted cost Markov Decision Processes to “exponentially representable” discount functions, prove existence of optimal policies which are stationary from some time N onward, and provide an algorithm for their computation. Outside this class, optimal “N-stationary” policies in general do not exist.
Continuous Markov equilibria with quasi-geometric discounting
We prove that the standard quasi-geometric discounting model used in dynamic consumer theory and political economics does not possess continuous Markov perfect equilibria (MPE) if there is a strictly positive lower bound on wealth. We also show that, at points of discontinuity, the decision maker strictly prefers lotteries over the next period’s assets. We then extend the standard model to have...
Continuous time Markov decision processes
In this paper, we consider denumerable state continuous time Markov decision processes with (possibly unbounded) transition and cost rates under average criterion. We present a set of conditions and prove the existence of both average cost optimal stationary policies and a solution of the average optimality equation under the conditions. The results in this paper are applied to an admission con...
Solving Structured Continuous-Time Markov Decision Processes
We present an approach to solving structured continuous-time Markov decision processes. We approximate the optimal value function by a compact linear form, resulting in a linear program. The main difficulty arises from the number of constraints, which grows exponentially with the number of variables in the system. We exploit the representation of continuous-time Bayesian networks (CTBNs) to de...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i10.26413